88 research outputs found

    Using Learned Conditional Distributions as Edit Distance


    A log square average case algorithm to make insertions in fast similarity search

    To speed up similarity-based searches, many indexing techniques have been proposed. However, most of them do not admit fast insertion of new elements once the index is built, so changes in the environment are very costly to take into account. In this work, we propose a new technique that allows fast insertion of elements in a family of static tree-based indexes. Unlike other techniques, the resulting index is exactly equal to the index that would be obtained by building it from scratch, so there is no performance degradation in search time. We show that the expected number of distance computations (and the average time complexity) is bounded by a function that grows with log^2(n), where n is the size of the database. To check the correctness of our approach, experiments with artificial and real data are carried out. This work has been supported in part by Grants TIN2009-14205-C04-01 from the Spanish CICYT (Ministerio de Ciencia e Innovación), the IST Programme of the European Community, under the Pascal Network of Excellence, IST-2002-506778, and the program CONSOLIDER INGENIO 2010 (CSD2007-00018)

    Learning Multiplicity Tree Automata

    In this paper, we present a theoretical approach to the problem of learning multiplicity tree automata. These automata allow one to define functions that compute a number for each tree. They can be seen as a strict generalization of stochastic tree automata, since they allow functions to be defined over any field K. A multiplicity automaton admits a support, which is a non-deterministic automaton. From a grammatical inference point of view, the contribution of this paper is original because it combines two important aspects: this is the first time, as far as we know, that a learning method focuses on non-deterministic tree automata that compute functions over a field. The proposed algorithm works in Angluin's exact learning model, where a learner is allowed to use membership and equivalence queries. We show that this algorithm runs in time polynomial in the size of the representation

    Recognition of pen-based music notation with finite-state machines

    This work presents a statistical model to recognize pen-based music compositions using stroke recognition algorithms and finite-state machines. The series of strokes received as input is mapped onto a stochastic representation, which is combined with a formal language that describes musical symbols in terms of stroke primitives. Then, a Probabilistic Finite-State Automaton is obtained, which defines probabilities over the set of musical sequences. This model is eventually crossed with a semantic language to avoid sequences that do not make musical sense. Finally, a decoding strategy is applied in order to output a hypothesis about the musical sequence actually written. Comprehensive experiments with several decoding algorithms, stroke similarity measures and probability density estimators are carried out and evaluated with different metrics of interest. The results show the goodness of the proposed model, which obtains competitive performance in all metrics and scenarios considered. This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012–0939) and the Spanish Ministerio de Economía y Competitividad through the TIMuL Project (No. TIN2013-48152-C2-1-R, supported by UE FEDER funds)
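    As a rough illustration of how a probabilistic finite-state automaton assigns a probability to a sequence, the sketch below runs the standard forward computation over a toy automaton. The states, the symbol "a" and all probabilities are invented for the example; they are not taken from the paper, whose automaton is built from stroke primitives and musical symbols.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

State = Hashable
Transitions = Dict[Tuple[State, str], List[Tuple[State, float]]]


def sequence_probability(
    seq: Sequence[str],
    initial: Dict[State, float],
    transitions: Transitions,
    final: Dict[State, float],
) -> float:
    """Probability that the automaton generates exactly `seq`."""
    # forward[s] = probability of having read the prefix so far and being in s
    forward = dict(initial)
    for symbol in seq:
        nxt: Dict[State, float] = {}
        for state, p in forward.items():
            for target, prob in transitions.get((state, symbol), []):
                nxt[target] = nxt.get(target, 0.0) + p * prob
        forward = nxt
    # Weight each reachable state by its stopping (final) probability.
    return sum(p * final.get(state, 0.0) for state, p in forward.items())


# Toy automaton (hypothetical): from state 0, reading "a" stays in 0 (0.5)
# or moves to 1 (0.25); state 0 stops with 0.25, state 1 always stops.
initial = {0: 1.0}
transitions = {(0, "a"): [(0, 0.5), (1, 0.25)]}
final = {0: 0.25, 1: 1.0}

p_a = sequence_probability(["a"], initial, transitions, final)
p_aa = sequence_probability(["a", "a"], initial, transitions, final)
```

    A sequence with no accepting path, such as one containing a symbol the automaton never emits, receives probability zero, which is the mechanism a semantic filter can exploit to rule out invalid sequences.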

    Recognition of online handwritten music symbols

    Paper submitted to MML 2013, 6th International Workshop on Machine Learning and Music, Prague, September 23, 2013. An effective way of digitizing a new musical composition is to use an e-pen and tablet application in which the user's pen strokes are recognized online and the digital score is created with the sole effort of the composition itself. This work aims to be a starting point for research on the recognition of online handwritten music notation. To this end, different alternatives within the two modalities of recognition resulting from this data are presented: online recognition, which uses the strokes marked by a pen, and offline recognition, which uses the image generated after drawing the symbol. A comparative experiment with common machine learning algorithms over a dataset of 3800 samples and 32 different music symbols is presented. Results show that samples of the actual user are needed if good classification rates are pursued. Moreover, algorithms using the online data, on average, achieve better classification results than the others

    Impact of the initialization in tree-based fast similarity search techniques

    Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built that speed up the search when querying. Usually, pivot selection techniques are incremental, with the first pivot chosen at random. This article explores several techniques to choose the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones. The authors thank the Spanish CICyT for partial support of this work through projects TIN2009-14205-C04-C1, the Ist Programme of the European Community, under the Pascal Network of Excellence, (Ist– 2006-216886), and the program Consolider Ingenio 2010 (Csd2007-00018)
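    To make the role of a pivot concrete, here is a minimal sketch of pivot-based pruning with the triangle inequality: since |d(q, p) - d(x, p)| is a lower bound on d(q, x), stored pivot distances let the search discard candidates without computing their distance to the query. This is a generic flat-table illustration, not the tree-based index studied in the article; the function names and the one-dimensional example data are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def build_pivot_table(
    data: Sequence[T], pivot: T, dist: Callable[[T, T], float]
) -> List[float]:
    """Precompute the distance from every element to the pivot."""
    return [dist(x, pivot) for x in data]


def nn_search(
    query: T,
    data: Sequence[T],
    pivot: T,
    table: Sequence[float],
    dist: Callable[[T, T], float],
) -> Tuple[int, float, int]:
    """Return (index of nearest neighbour, its distance, distance computations)."""
    d_qp = dist(query, pivot)
    computations = 1
    best_idx, best_d = -1, float("inf")
    # Visit candidates by increasing lower bound |d(q,p) - d(x,p)|.
    order = sorted(range(len(data)), key=lambda i: abs(d_qp - table[i]))
    for i in order:
        if abs(d_qp - table[i]) >= best_d:
            break  # every remaining candidate has an even larger lower bound
        d = dist(query, data[i])
        computations += 1
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx, best_d, computations


def abs_dist(a: float, b: float) -> float:
    return abs(a - b)


data = [0.0, 1.0, 5.0, 9.0, 10.0]
table = build_pivot_table(data, 0.0, abs_dist)
idx, d, n_comp = nn_search(9.2, data, 0.0, table, abs_dist)
```

    In this toy run only one data element is compared with the query in addition to the pivot itself; how much is pruned in practice depends on the metric, the data and, as the abstract argues, on which pivot is chosen first.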

    An efficient approach for Interactive Sequential Pattern Recognition

    Interactive Pattern Recognition (IPR) is an emergent framework in which the user is involved actively in the recognition process by giving feedback to the system when an error is detected. Although this framework is expected to reduce the number of errors to correct, it may increase the time required to complete the task since the machine needs to recompute its proposal after each interaction. Therefore, a fast computation is required to make the interactive system profitable and user-friendly. This work presents an efficient approach to deal with IPR tasks when data has a sequential nature. Our approach includes some computation at the very beginning of the task but it then achieves a linear complexity after user corrections. We also show how these tasks can be effectively carried out if the solution space is defined with a Regular Language. This fact has indeed proven to be the most relevant factor to improve the efficiency of the approach. Several experiments are carried out in which our proposal is compared against a classical search. Results show a reduction in time in all experiments considered, solving efficiently some complex IPR tasks thanks to our proposals. This work was partially supported by the Spanish Ministerio de Educación, Cultura y Deporte through FPU fellowship (AP2012-0939) and the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by UE FEDER funds)

    Staff-line detection and removal using a convolutional neural network

    Staff-line removal is an important preprocessing stage for most optical music recognition systems. Common procedures to solve this task involve image processing techniques. In contrast to these traditional methods based on hand-engineered transformations, the problem can also be approached as a classification task in which each pixel is labeled as either staff or symbol, so that only those that belong to symbols are kept in the image. In order to perform this classification, we propose the use of convolutional neural networks, which have demonstrated an outstanding performance in image retrieval tasks. The initial features of each pixel consist of a square patch from the input image centered at that pixel. The proposed network is trained by using a dataset which contains pairs of scores with and without the staff lines. Our results in both binary and grayscale images show that the proposed technique is very accurate, outperforming both other classifiers and the state-of-the-art strategies considered. In addition, several advantages of the presented methodology with respect to traditional procedures proposed so far are discussed. This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012–0939), the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R supported by EU FEDER funds) and the Instituto Universitario de Investigación Informática (IUII) from the University of Alicante
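    The per-pixel formulation can be sketched as follows: a square patch centered at each pixel is extracted and handed to a classifier, and only pixels labeled as symbol are kept. The patch size, the zero padding at the borders, the assumption that background pixels are 0, and the `classify` callable (a stand-in for the trained convolutional network) are all assumptions made for illustration and are not specified in the abstract.

```python
from typing import Callable, List

Image = List[List[int]]


def extract_patch(image: Image, row: int, col: int, half: int) -> Image:
    """Square patch of side 2*half+1 centered at (row, col), zero-padded."""
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch


def remove_staff(
    image: Image, classify: Callable[[Image], str], half: int = 1
) -> Image:
    """Keep a pixel only if its patch is classified as 'symbol'."""
    return [
        [
            image[r][c] if classify(extract_patch(image, r, c, half)) == "symbol" else 0
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


def toy_classifier(patch: Image) -> str:
    # Hypothetical rule, NOT a CNN: ink directly above and below the center.
    return "symbol" if patch[0][1] and patch[2][1] else "staff"


image = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
cleaned = remove_staff(image, toy_classifier)
```

    In the setting described above, the toy rule would be replaced by a network trained on pairs of scores with and without staff lines; the patch extraction step stays the same.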